    Comparing hard and overlapping clusterings

    Similarity measures for comparing clusterings are an important component of, e.g., evaluating clustering algorithms, consensus clustering, and clustering stability assessment. These measures have been studied for over 40 years in the domain of exclusive hard clusterings (exhaustive and mutually exclusive object sets). In recent years, the literature has proposed measures to handle more general clusterings (e.g., fuzzy/probabilistic clusterings). This paper provides an overview of these new measures and discusses their drawbacks. We ultimately develop a corrected-for-chance measure (13AGRI) capable of comparing exclusive hard, fuzzy/probabilistic, non-exclusive hard, and possibilistic clusterings. We prove that 13AGRI and the adjusted Rand index (ARI, by Hubert and Arabie) are equivalent in the exclusive hard domain. The reported experiments show that only 13AGRI provided both a fine-grained evaluation across clusterings with different numbers of clusters and a constant evaluation between random clusterings, thereby exhibiting all four desirable properties considered here. We identified a high correlation between 13AGRI applied to fuzzy clusterings and ARI applied to hard exclusive clusterings over 14 real data sets from the UCI repository, which corroborates the validity of 13AGRI for fuzzy clustering evaluation. 13AGRI also showed good results as a clustering stability statistic for solutions produced by the expectation maximization algorithm for Gaussian mixture.
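For readers unfamiliar with the adjusted Rand index mentioned in this abstract, the following is a minimal sketch of the standard ARI for exclusive hard clusterings only. It is not the 13AGRI measure, whose definition the abstract does not give. The function name `adjusted_rand_index` and the choice to return 1.0 in degenerate cases are assumptions made for illustration.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Standard ARI between two exclusive hard clusterings.

    Each argument is a sequence of cluster labels, one per object.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same objects")
    n = len(labels_a)
    if n < 2:
        return 1.0  # assumption: no object pairs, treat as perfect agreement

    # Pairs of objects placed together in both clusterings.
    contingency = Counter(zip(labels_a, labels_b))
    together_in_both = sum(comb(c, 2) for c in contingency.values())

    # Pairs of objects placed together in each clustering separately.
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    # Expected agreement under random labelings with fixed cluster sizes.
    expected = together_a * together_b / comb(n, 2)
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        return 1.0  # assumption: degenerate case (e.g., one cluster each)
    return (together_in_both - expected) / (maximum - expected)
```

The correction for chance is what makes the score of two unrelated clusterings sit near zero regardless of the number of clusters, which is the kind of behaviour the abstract refers to as a constant evaluation between random clusterings.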

    Impact of respiratory functional re-education on people with pleural effusion: a systematic literature review

    Introduction: Pleural effusion is defined as the abnormal accumulation of fluid in the pleural space. It can be caused by a large number of pathological conditions and can lead to major complications; its treatment depends on the cause and size of the effusion. Respiratory functional re-education, carried out by nurses specialized in rehabilitation nursing, encompasses a set of techniques that act on breathing with direct implications for alveolar mechanics, and is seen as an intervention that enhances symptom reduction and optimizes functionality. In this context, the aim of this study is to determine how respiratory functional re-education affects people with pleural effusion. Methods: A systematic review of the literature on studies evaluating the impact of respiratory functional re-education on pleural effusion was carried out. PUBMED, EBSCO, Google Scholar and SciELO were searched for studies published between January 2008 and May 2017, which were subsequently evaluated against previously established inclusion and exclusion criteria.
Results: Three studies fulfilled the inclusion criteria. Their results show that, in people with pleural effusion, respiratory functional re-education is intended to prevent the formation of pleural adhesions; avoid limitation of thoraco-pulmonary and diaphragmatic mobility; prevent or correct defective antalgic postures and their consequences; prevent postural deformities such as retraction of the affected hemithorax and limitation of the scapulohumeral joint; encourage lung expansion; and promote reabsorption of the pleural effusion in order to improve pulmonary performance. The evidence found also allowed us to draw up an intervention plan directed at people with pleural effusion. Conclusion: The respiratory functional re-education program is a valuable adjunct treatment, bringing significant benefits to patients in terms of pulmonary performance and reduced length of hospital stay. Keywords: Respiratory Rehabilitation; Respiratory Exercises; Respiratory Functional Re-education; Pleural Effusion

    Is the internal training load different between starting and nonstarting volleyball athletes? A case study

    The same training stimulus can produce different physiological adaptations in athletes of the same team. Thus, the aim of this study was to analyze and compare the training load of starters and nonstarters on a men's volleyball team at different times of the season. The sample consisted of fifteen athletes from the men's volleyball superleague, who were divided into two groups: starters and nonstarters. The study used the training load of the ten weeks of the team's preparation period for the main championship of the season, during which no games were played. The session rating of perceived exertion (session-RPE) method proposed by Foster et al. (2001) was used to quantify the training load. The starters had higher total weekly training load (TWTL) and RPE values on average over the ten weeks of training (p<0.05). Starters also had higher TWTL values than nonstarters in the preparatory and pre-competitive periods (p<0.05). When the weeks were analyzed separately, weeks three and seven showed higher TWTL and RPE values for starters than for nonstarters (p<0.05). The results of this study show that starters had a greater internal training load than nonstarters.
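As an illustration of the session-RPE quantification named in this abstract, the sketch below assumes the usual formulation of the method: each session's load is the athlete's perceived-exertion rating multiplied by the session duration in minutes, and the total weekly training load is the sum over the week's sessions. The function names and the example figures are hypothetical and are not taken from the study.

```python
def session_load(rpe, duration_min):
    """Session-RPE load: perceived exertion rating times duration in minutes."""
    if rpe < 0 or duration_min < 0:
        raise ValueError("rating and duration must be non-negative")
    return rpe * duration_min


def total_weekly_training_load(sessions):
    """Sum the loads of one athlete's sessions for a single week.

    `sessions` is an iterable of (rpe, duration_min) pairs.
    """
    return sum(session_load(rpe, minutes) for rpe, minutes in sessions)


# Hypothetical week for one athlete: (rating, minutes) per session.
example_week = [(5, 90), (7, 60), (4, 120)]
weekly_load = total_weekly_training_load(example_week)  # in arbitrary units
```

Because the load depends on each athlete's own rating, two players doing the same prescribed session can record different internal loads, which is the comparison the study makes between starters and nonstarters.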

    The COVID-19 pandemic: a letter to G20 leaders

    Mammals in Portugal: a data set of terrestrial, volant, and marine mammal occurrences in Portugal

    Mammals are threatened worldwide, with ~26% of all species being included in the IUCN threatened categories. This overall pattern is primarily associated with habitat loss or degradation and human persecution for terrestrial mammals, and with pollution, open net fishing, climate change, and prey depletion for marine mammals. Mammals play a key role in maintaining ecosystem functionality and resilience, and therefore information on their distribution is crucial to delineate and support conservation actions. MAMMALS IN PORTUGAL is a publicly available data set compiling unpublished georeferenced occurrence records of 92 terrestrial, volant, and marine mammals in mainland Portugal and the archipelagos of the Azores and Madeira. It includes 105,026 data entries between 1873 and 2021 (72% of the data occurring between 2000 and 2021). The methods used to collect the data were: live observations/captures (43%), sign surveys (35%), camera trapping (16%), bioacoustics surveys (4%), and radiotracking and inquiries, which together represent less than 1% of the records. The data set includes 13 types of records: (1) burrows | soil mounds | tunnel, (2) capture, (3) colony, (4) dead animal | hair | skulls | jaws, (5) genetic confirmation, (6) inquiries, (7) observation of live animal, (8) observation in shelters, (9) photo trapping | video, (10) predators diet | pellets | pine cones/nuts, (11) scat | track | ditch, (12) telemetry and (13) vocalization | echolocation. The spatial uncertainty of most records ranges between 0 and 100 m (76%). Rodentia (n = 31,573) has the highest number of records, followed by Chiroptera (n = 18,857), Carnivora (n = 18,594), Lagomorpha (n = 17,496), Cetartiodactyla (n = 11,568) and Eulipotyphla (n = 7,008). The data set includes records of species classified by the IUCN as threatened (e.g., Oryctolagus cuniculus [n = 12,159], Monachus monachus [n = 1,512], and Lynx pardinus [n = 197]).
We believe that this data set may stimulate the publication of data sets from other European countries, which would certainly contribute to ecology and conservation-related research and thereby assist in the development of more accurate and tailored conservation management strategies for each species. There are no copyright restrictions; please cite this data paper when the data are used in publications.

    Evolutionary approaches to relational data clustering

    Data clustering is a fundamental technique for applications in several fields of business and science, such as commerce, biology, psychiatry, astronomy, and Web mining. However, in a subset of these fields, such as industrial engineering, social sciences, earthquake engineering, and document retrieval, data sets are usually described only by the proximities between their objects (called relational data sets). Even in applications where the data are not naturally relational, the use of relational data sets keeps the data themselves confidential, which can be of great value to banks or brokers, for instance. This dissertation presents a review of data clustering algorithms that deal with relational data sets, with a focus on algorithms that produce hard (or crisp) partitions of the data. Particular emphasis is given to evolutionary algorithms, which have proved able to solve data clustering problems with relative accuracy and computational efficiency. In this context, we propose a new evolutionary clustering algorithm able to operate on relational data sets and to automatically estimate the number of clusters (which is usually unknown in practical applications). It is empirically shown that this new algorithm can outperform traditional methods described in the literature in terms of computational efficiency and accuracy.

    Algorithms and validation techniques in multi-represented data clustering, possibilistic clustering and bi-clustering

    There are data sets for which the instances are naturally represented by more than one view. For example, images can be described by attributes of color, texture, and shape. Proteins can be characterized by their amino acid sequence and by their three-dimensional representation. Unifying the different views of a data set can be problematic because they may not be comparable or may have different degrees of importance. These degrees of importance may even manifest themselves locally, according to the data substructures. This prompted the emergence of clustering algorithms capable of handling multi-represented data sets (i.e., data sets having more than one view), such as the SCAD algorithm. This algorithm has shown promising results in experiments reported in the literature, but it has critical problems, identified in this work, that prevent it from working in certain scenarios.
These problems were solved here by proposing a new version of the algorithm, called ASCAD, based on formal proofs about its correctness. We developed relational versions of ASCAD, capable of handling data sets described only by the proximities between the instances. We also developed an index for internal and relative validation of multi-represented data clusterings. The evaluation of possibilistic clustering and bi-clustering by comparing the found and reference solutions (external validation) was also explored. Bi-clustering algorithms have gained increasing interest from the gene expression analysis community. However, little is known about the behavior and properties of the measures aimed at external validation of bi-clustering, which motivated a theoretical and empirical analysis of these measures in this work. This analysis showed that most bi-clustering measures have critical issues and highlighted two of the measures as the most promising. We included in this analysis three measures of non-exclusive partitional clustering, whose use in comparing bi-clusterings is possible through a new bi-clustering evaluation approach proposed in this thesis. Non-exclusive partitional clustering belongs to a more general domain of solutions, i.e., the domain of possibilistic clusterings. We observed some important conceptual flaws in the measures of possibilistic clustering, which motivated us to develop new measures and to analyze 34 measures conceptually and empirically. One of the proposed measures stood out as the only one that provided unbiased evaluations regarding the number of clusters, the maximum similarity when comparing the optimal solution with the reference one, and evaluations sensitive to solution differences in all scenarios considered.

    Similarity measures for comparing biclusterings

    The comparison of ordinary partitions of a set of objects is well established in the clustering literature, which includes several studies analyzing the properties of similarity measures for comparing partitions. However, similarity measures for clusterings are not readily applicable to biclusterings, since each bicluster is a tuple of two sets (of rows and columns), whereas a cluster is only a single set (of rows). Some biclustering similarity measures have been defined as minor contributions in papers that primarily report on proposals and evaluations of biclustering algorithms or on comparative analyses of such algorithms. As a consequence, some desirable properties of these measures have been overlooked in the literature. We review 14 biclustering similarity measures. We define eight desirable properties of a biclustering measure, discuss their importance, and prove which properties each of the reviewed measures has. We show examples, drawn from or inspired by important studies, in which several biclustering measures convey misleading evaluations due to the absence of one or more of the discussed properties. We also advocate the use of a more general comparison approach based on the idea of transforming the original problem of comparing biclusterings into an equivalent problem of comparing clustering partitions with overlapping clusters.
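The transformation advocated at the end of this abstract can be pictured with a small sketch. The abstract does not spell out the construction, so the version below is one plausible reading, stated as an assumption: every matrix cell (row, column) is treated as an object, and each bicluster becomes the set of cells it covers, so that biclusters sharing cells turn into overlapping clusters. The function names are hypothetical.

```python
from itertools import product


def bicluster_to_cell_cluster(rows, cols):
    """Represent one bicluster as the set of (row, column) cells it covers."""
    return frozenset(product(rows, cols))


def biclustering_to_clustering(biclusters):
    """Turn a list of (rows, cols) biclusters into possibly overlapping
    clusters of matrix cells (assumed construction, for illustration)."""
    return [bicluster_to_cell_cluster(rows, cols) for rows, cols in biclusters]


# Two biclusters that share the cell (1, 0).
clusters = biclustering_to_clustering([({0, 1}, {0}), ({1}, {0, 2})])
shared_cells = clusters[0] & clusters[1]
```

Once both biclusterings are expressed this way, any similarity measure defined for clusterings with overlapping clusters could in principle be applied to them.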

    Automatic aspect discrimination in relational data clustering

    The features describing a data set may often be arranged in meaningful subsets, each of which corresponds to a different aspect of the data. An unsupervised algorithm (SCAD) that performs fuzzy clustering and aspect weighting simultaneously was recently proposed. However, there are several situations where the data set is represented by proximity matrices only (relational data), which renders several clustering approaches, including SCAD, inappropriate. To handle this kind of data, the relational clustering algorithm CARD, based on the SCAD algorithm, has recently been developed. However, CARD may fail and halt under certain conditions. To fix this problem, its steps are modified and then reordered, which also reduces the number of parameters required. The improved CARD is assessed over hundreds of real and artificial data sets.